Process for the specific isotopic labeling of methyl groups of Val, Leu and Ile

ABSTRACT

The invention relates to a process for the specific isotopic labeling of Valine, Leucine and Isoleucine amino acids. The process of the invention uses a 2-alkyl-2-hydroxy-3-oxobutanoic acid in which the alkyl substituent in position 2 is ethyl or methyl. The invention can be used for the analysis of proteins, in particular by NMR.

The invention relates to a process for the specific isotopic labeling of Valine, Leucine and Isoleucine amino acids and more particularly for the stereospecific labeling of methyl groups of Leucine and Valine as well as specific labeling of the γ2 methyl groups of Isoleucine in proteins and biomolecular assemblies.

It also relates to specifically methyl labeled 2-hydroxy-2-methyl-3-oxobutanoic acid and 2-ethyl-2-hydroxy-3-oxobutanoic acid (named as acetolactate derivatives in the following) used in this process and to a process for manufacturing such specifically methyl labeled acetolactate derivatives.

NMR spectroscopy is an established and powerful method for determining structures and for mapping biomolecular interactions in complexes with affinities ranging from low nanomolar to a few millimolar.

But poor resolution in NMR spectra remains a major limiting factor to the application of this method to large molecular assemblies such as proteins of molecular weight up to 1 megadalton, even though recent progress has been made in NMR spectroscopy of high molecular weight proteins. This progress is strongly connected to the development of new isotopic labeling schemes. The combination of selective protonation of methyl groups in fully perdeuterated proteins with transverse relaxation optimized methyl spectroscopy (methyl-TROSY) has allowed the local structure and dynamics of protein assemblies of up to 1 megadalton to be studied by solution NMR spectroscopy.

Such labeling protocols rely on the addition of specific [¹H¹³C] methyl labeled biosynthetic precursors as the sole proton source in a perdeuterated culture medium.

This approach provides a high level of methyl protonation without detectable isotopic scrambling.

Valine (Val), Leucine (Leu) and Isoleucine (Ile) are amino acids of great interest as their methyl groups account for more than 50% of the total methyl probes available in proteins.

Protonation of Leu and Val methyl groups in perdeuterated proteins is commonly achieved using methyl protonated 2-oxo-3-methylbutanoic acid (named α-ketoisovalerate in the following), an intermediate in the biosynthesis of these amino acids.

However, in particular for large proteins of 30 kDa to 1 megadalton, this labeling strategy can result in overcrowded [¹H¹³C]-correlation spectra due to the sheer number of NMR-visible methyl probes. Overlap in NMR spectra can greatly complicate the measurement of site-specific structural or relaxation parameters.

The invention aims to overcome the drawbacks of this method by providing a process for specific isotopic labeling of Valine, Leucine and Isoleucine amino acids and more particularly for the labeling of methyl groups of Leucine, Valine and Isoleucine in recombinant perdeuterated proteins.

This process offers a significant enhancement in the resolution and sensitivity of [¹³C¹H]-methyl TROSY spectra and extends the capacity for detecting long-range structurally meaningful distance restraints in large proteins. This method is particularly useful for investigation of molecular interactions within large biomolecular assemblies.

Furthermore, for proteins of small and medium size (i.e. of less than 100 kDa), this process significantly improves the stereospecific assignment of the methyl groups of the Leucine and Valine amino acids compared to previous methods using a mixture of unlabeled and labeled precursors (glucose or pyruvate). The latter method is described in D. Neri, T. Szyperski, G. Otting, H. Senn, K. Wüthrich, Biochemistry 1989, 28, 7510-7516.

To this end, the invention proposes a process for the specific isotopic labeling of amino acids selected from the group consisting of Val, Leu, and Ile, and more particularly for the stereospecific labeling of methyl groups of Leu and Val as well as specific labeling of the γ2 methyl groups of Isoleucine in proteins and biomolecular assemblies comprising the following step a):

introducing, in a medium containing bacteria overexpressing a protein, a 2-alkyl-2-hydroxy-3-oxobutanoic acid, wherein the alkyl substituent in position 2 is ethyl or methyl, also called acetolactate derivatives in the following and having the following formula:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

In Formula I, X is an exchangeable hydrogen which can be ¹H or ²H depending on the nature of the solvent.
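The two provisos above can be illustrated with a minimal Python sketch. It is not part of the described process. The function name `satisfies_provisos` and the encoding of isotopes as `"H"` (¹H) and `"D"` (²H) are assumptions made for illustration, and the sketch assumes that the hydrogen atoms of the molecule are those of R¹, R² and a single exchangeable X.

```python
from typing import Sequence


def satisfies_provisos(r1_h: Sequence[str], r2_h: Sequence[str], x: str = "H") -> bool:
    """Check provisos 1) and 2) for a candidate labeling pattern.

    r1_h: isotopes of the three hydrogens of the methyl group R1.
    r2_h: isotopes of the hydrogens of R2 (3 for methyl, 5 for ethyl).
    x:    isotope of the exchangeable hydrogen X.
    Isotopes are encoded as "H" (1H) or "D" (2H); this encoding is hypothetical.
    """
    if len(r1_h) != 3 or len(r2_h) not in (3, 5):
        raise ValueError("R1 must have 3 hydrogens and R2 must have 3 or 5")
    alkyl_h = list(r1_h) + list(r2_h)
    if not set(alkyl_h + [x]) <= {"H", "D"}:
        raise ValueError('isotopes must be "H" or "D"')

    # Proviso 1: the hydrogens of the whole molecule are not all 1H or all 2H.
    proviso_1 = len(set(alkyl_h + [x])) > 1
    # Proviso 2: the hydrogens of R1 and R2 taken together are not all 1H or all 2H.
    proviso_2 = len(set(alkyl_h)) > 1
    return proviso_1 and proviso_2


# One methyl group fully deuterated, the other fully protonated: accepted.
print(satisfies_provisos(["D"] * 3, ["H"] * 3))
# Both alkyl groups deuterated, only X protonated: rejected by proviso 2.
print(satisfies_provisos(["D"] * 3, ["D"] * 5, x="H"))
```

Under this reading, proviso 2 is the stricter condition: any pattern that mixes ¹H and ²H within R¹ and R² also satisfies proviso 1.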

This process furthermore comprises the following steps:

b) overexpression of the protein by the bacteria contained in the medium, and

c) purification of the protein.

In a first preferred embodiment of the process of labeling of the invention, the acetolactate derivative has the Formula I in which:

-   R¹ is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D,
-   R² is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D, ¹²CH₃¹²CD₂, ¹²CD₃¹²CD₂, ¹³CH₃¹²CD₂, ¹³CH₃¹³CD₂, ¹³CD₃¹³CD₂, ¹³CHD₂¹³CD₂, ¹³CH₂D¹³CD₂, ¹³CHD₂¹²CD₂, ¹³CH₂D¹²CD₂,
-   each Y is independently from the others ¹²C or ¹³C,

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

In a second preferred embodiment of the process of labeling of the invention, the acetolactate derivative is selected in the group of compounds having the following formulae:

-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,
-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,
-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,
-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,
-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid.

In a third preferred embodiment of the process of labeling of the invention, the acetolactate derivative is selected in the group of compounds having the following formulae:

-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.

In a fourth preferred embodiment of the process of labeling of the invention, the acetolactate derivative is selected in the group of compounds of the following formulae:

-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,
-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid.

The invention also proposes a compound of the following formula I-1:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D),

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.
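Proviso 3), which applies only to the compounds of formula I-1, can be sketched as a simple check. This is an illustrative sketch only. The function name `satisfies_proviso_3` and the encoding of hydrogen isotopes as `"H"`/`"D"` and of carbon isotopes as the integers 12 and 13 are assumptions, and "the carbon atoms, in formula I-1" is read here as every carbon atom of the molecule, including those of R¹ and R².

```python
from typing import Sequence


def satisfies_proviso_3(r1_h: Sequence[str], carbons: Sequence[int]) -> bool:
    """Return True when proviso 3) is respected.

    r1_h:    isotopes of the hydrogens of R1, encoded "H" (1H) or "D" (2H).
    carbons: mass numbers (12 or 13) of all carbon atoms of the molecule.
    Both encodings are hypothetical and chosen for illustration.
    """
    r1_fully_deuterated = all(h == "D" for h in r1_h)
    all_carbons_12c = all(c == 12 for c in carbons)
    # The proviso only excludes the combination "R1 is CD3" AND "no 13C anywhere".
    return not (r1_fully_deuterated and all_carbons_12c)


# R1 fully deuterated and no 13C label: excluded from formula I-1.
print(satisfies_proviso_3(["D", "D", "D"], [12, 12, 12, 12, 12]))
# R1 fully deuterated but one carbon is 13C: allowed.
print(satisfies_proviso_3(["D", "D", "D"], [12, 12, 12, 12, 13]))
```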

The preferred compounds of the invention are selected from the group consisting of the compounds having the formula I-1 in which:

-   R¹ is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D,
-   R² is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D, ¹²CH₃¹²CD₂, ¹²CD₃¹²CD₂, ¹³CH₃¹²CD₂, ¹³CH₃¹³CD₂, ¹³CD₃¹³CD₂, ¹³CHD₂¹³CD₂, ¹³CH₂D¹³CD₂, ¹³CHD₂¹²CD₂, ¹³CH₂D¹²CD₂,
-   each Y is independently from the others ¹²C or ¹³C,

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D), and

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.

It is to be noted that the active part of the acetolactate derivatives of formula I is, in fact, the anion of the following formula:

wherein:

X, Y, R¹ and R² are as defined for the compounds of formulae I and I-1.

In a first preferred embodiment, the compound of formula I-1 of the invention is selected among the compounds of the following formulae:

-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,
-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,
-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,
-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,
-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid.

In a second preferred embodiment, the compound of formula I-1 of the invention is selected among the compounds of the following formulae:

-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.

In a third preferred embodiment, the compound of formula I-1 of the invention is selected among the compounds of the following formulae:

-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,
-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid.

The invention also proposes a process for manufacturing a compound of the following formula I:

comprising the following steps:

a) alkylation with a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), of a 3-oxobutanoate derivative having its hydroxyl group in position 1 protected by a protecting group, preferably a methyl or ethyl group,

b) hydroxylation of the compound obtained in step a),

c) optionally deprotection and exchange of the desired ¹H atoms by ²H atoms,

wherein step b) of hydroxylation is carried out by using dimethyldioxirane in the presence of Nickel (II) ions.

Furthermore, the invention also proposes processes for analysing proteins by NMR.

In a first embodiment, this process comprises a step of labeling the proteins to be analysed by the labeling process of the invention.

In a second embodiment, this process comprises a step of labeling the proteins to be analysed with a compound of the following formula I:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

In a third embodiment, this process comprises a step of labeling the proteins to be analysed with a compound of the following formula I-1:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D),

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.

The invention will be better understood and other features and advantages thereof will be more apparent when reading the following description.

In the description of the invention, the following terms have the following meanings:

-   Val designates the amino acid Valine,
-   Leu designates the amino acid Leucine,
-   Ile designates the amino acid Isoleucine,
-   C designates ¹³C,
-   C designates ¹²C,
-   D designates ²H,
-   H designates ¹H,
-   proR, proS: the two methyl groups on the γ carbon of unlabeled Leu and on the β carbon of unlabeled Val are not different, and consequently these carbons are not chiral. But when the groups R¹ and R² are not labeled in the same manner in the acetolactate derivatives of formula I, the resulting methyl groups on the γ carbon of Leu and on the β carbon of Val are differently labeled, and due to this difference these carbons become chiral. These methyl groups are designated as proR when labeling gives rise to an R configuration and as proS when labeling gives rise to an S configuration.
-   “Acetolactate derivative” designates compounds of formula I and their corresponding esters, preferably methyl esters or ethyl esters.
-   Biomolecular assemblies: molecules containing proteins and other groups.

The process of labeling of the invention is based on the use of acetolactate derivatives which have methyl or ethyl groups specifically labeled. These acetolactate derivatives are introduced in a medium containing bacteria overexpressing a protein or proteins of interest.

In other words, the process of the invention is based on the stereospecific rearrangement of labeled or unlabeled alkyl groups in acetolactate derivatives occurring in vivo in the early steps of Leu, Val and Ile biogenesis.

The proposed process of the invention for the specific isotopic labeling of methyl groups of amino acids, which are selected from the group consisting of Valine (Val), Leucine (Leu), and Isoleucine (Ile), in proteins comprises the following step a):

introducing, in a medium containing bacteria overexpressing a protein, a derivative of acetolactate having the following formula:

wherein:

-   X is an exchangeable hydrogen being ¹H or ²H (D) depending on the nature of the solvent, i.e. X is ¹H when the solvent is H₂O and X is ²H when the solvent is D₂O,
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

This process then comprises a step of culturing the medium to overexpress the protein(s) of interest, followed by the purification and isolation of the protein(s).

Preferably, the acetolactate derivatives of formula I are selected from the group consisting of the compounds in which:

-   R¹ is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D,
-   R² is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D, ¹²CH₃¹²CH₂, ¹²CD₃¹²CD₂, ¹³CH₃¹²CD₂, ¹³CH₃¹³CD₂, ¹³CD₃¹³CD₂, ¹³CHD₂¹³CD₂, ¹³CH₂D¹³CD₂, ¹³CHD₂¹²CD₂, ¹³CH₂D¹²CD₂,
-   each Y is independently from the others ¹²C or ¹³C,

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

These acetolactate derivatives have the following formulae 1-57 in which:

-   (¹³C)methyl means that the carbon atom of the methyl group is ¹³C,
-   (¹³C₂)ethyl means that the two carbon atoms of the ethyl group are ¹³C,
-   ²Hₙ, in which n=1, 2, 3, 4 or 5, means that the n hydrogen atoms are ²H, and
-   U means that all the carbon atoms are ¹³C:

-   1=2-hydroxy-2-methyl-3-oxo-4-(²H₃)butanoic acid,

-   2=2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid,

-   3=2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,

-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,

-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,

-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,

-   7=U-(¹³C)-2-hydroxy-2-methyl-3-oxo-4(²H₃)butanoic acid,

-   8=U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid,

-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,

-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,

-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,

-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid,

-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,

-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,

-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,

-   16=U-(¹³C)-2-hydroxy-2-(²H₂)methyl-3-oxo-4-(²H₃)butanoic acid,

-   17=U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂)butanoic acid,

-   18=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   19=2-(1′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   20=2-(1′-(²H₂),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,

-   23=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,

-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   25=U-(¹³C)-2-(1′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   26=U-(¹³C)-2-(²H)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   27=U-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   28=U-(¹³C)-2-((²H₅)ethyl)-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   29=1,2,3-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   30=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   31=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   32=U-(¹³C)-2-(1′-(²H₂),2′-(²H))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   33=1,2,3-(¹³C)-2-(2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   35=1,2,3-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,

-   38=3,4-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   39=3,4-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,

-   41=3,4-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   42=3,4-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   43=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   44=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,

-   47=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   48=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,

-   50=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   51=3,4-(¹³C)-2-(1′-(²H₂)-2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   52=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   53=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   54=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,

-   55=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,

-   56=1,2,3-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,

-   57=U-(¹³C)-2-(1′-(²H₂),2′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.

In order to better illustrate the formulae of the compounds of the invention, the developed formulae of compounds 1-18 are given below.

In these formulae, X designates ²H or ¹H, D designates ²H, C designates ¹³C and C designates ¹²C:

More specifically, when the compound of formula 1 is used in the process of the invention, the protonated methyl groups are proS methyl groups of Valine (γ2 methyl) and proS methyl groups of Leucine (δ2 methyl) and are ¹²CH₃.

Thus this compound enables structural studies of proteins containing Leu and Val amino acids. Labeling with this compound enables the stereospecific assignment of the methyl groups of Val and Leu and also the detection of weak dipolar and scalar interactions, as low as 0.05 Hz, in proteins of less than 30 kDa.

This compound of formula 1 (2-hydroxy-2-methyl-3-oxo-4-(²H₃)-butanoic acid) labels the amino acids Leu and Val incorporated in proteins in the following manner:

When this compound of formula 1 is used at 300 mg/L to produce human ubiquitin and protein G of Streptococcus (GB3) in E. coli cells grown in a deuterated M9 medium, only the proS methyl groups of Val and Leu are protonated in these proteins and are then observable in the 1D NMR spectra of the obtained proteins.
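As a worked illustration of the 300 mg/L dose mentioned above, the following Python sketch converts it into a mass of precursor for a given culture volume and into an approximate molar concentration. It is a sketch under stated assumptions: the helper names are hypothetical, the precursor is assumed to be weighed as the free acid with protonated OH and COOH groups, the isotope masses are standard reference values, and oxygen is taken at its natural-abundance average.

```python
from typing import Dict, Tuple

# Masses in g/mol: exact isotope masses for H and C, natural-abundance average for O.
MASS = {"1H": 1.007825, "2H": 2.014102, "12C": 12.000000, "13C": 13.003355, "O": 15.999}

# Compound 1, 2-hydroxy-2-methyl-3-oxo-4-(2H3)butanoic acid: C5 H5 D3 O4.
# Assumption: free acid form, exchangeable hydrogens counted as 1H.
COMPOUND_1 = {"12C": 5, "1H": 5, "2H": 3, "O": 4}


def molar_mass(composition: Dict[str, int]) -> float:
    """Sum the isotope masses of a composition given as {isotope: count}."""
    return sum(MASS[isotope] * count for isotope, count in composition.items())


def precursor_amounts(
    conc_mg_per_l: float, culture_volume_l: float, composition: Dict[str, int]
) -> Tuple[float, float]:
    """Return (mass of precursor in mg, concentration in mmol/L)."""
    mass_mg = conc_mg_per_l * culture_volume_l
    # mg/L divided by g/mol gives mmol/L.
    mmol_per_l = conc_mg_per_l / molar_mass(composition)
    return mass_mg, mmol_per_l


mass_mg, mmol_per_l = precursor_amounts(300.0, 0.5, COMPOUND_1)
print(f"{mass_mg:.0f} mg for 0.5 L, about {mmol_per_l:.2f} mmol/L")
```

The culture volume of 0.5 L is an arbitrary example value and is not taken from the text.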

The compound of formula 2 (2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid) enables the stereospecific labeling, with ¹²CH₃, of the proR methyl groups of Leu and Val. This compound enables structural studies, stereospecific assignment and detection of weak dipolar and scalar interactions by NMR in proteins of less than 30 kDa.

The stereospecific labeling of these methyl groups with compound 2 gives the following labeling of amino acids Leu and Val:

In 1D NMR spectra of proteins obtained by culture of a medium containing this compound of formula 2, only the proR methyl groups of Leucine and Valine, more precisely the γ1 methyl group of Valine and the δ1 methyl group of Leucine, are observable.

The compound of formula 3 enables the specific labeling, with ¹²CH₃, of the γ2 methyl groups of Ile. This compound enables structural studies, specific assignment and detection of weak dipolar and scalar interactions by NMR in proteins of less than 30 kDa.

The labeling of the γ2 methyl groups of Ile with the compound of formula 3 (2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid) is shown in the following schema:

It enables structural studies of proteins and biomolecular assemblies by liquid NMR.

The compound of formula 4 enables the stereospecific labeling, with ¹³CH₃, of the proS methyl groups of Valine (γ2 methyl) and the proS methyl groups of Leucine (δ2 methyl). It enables structural studies of proteins and large biomolecular assemblies by liquid NMR.

The stereospecific labeling of the proS methyl groups of Leucine and Valine with the compound of formula 4 (2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid) is shown in the following schema:

When the compound of formula 4 is used for stereospecifically labeling TET2 (U-[²H], U-[¹²C], Leu/Val-[¹³C¹H₃]^(proS) TET2), a dodecameric protease of 468 kDa, and malate synthase G (MSG) (U-[²H], U-[¹²C], Leu/Val-[¹³C¹H₃]^(proS) MSG) produced using E. coli cells in a deuterated M9 medium in the presence of 300 mg/L of the compound of formula 4, the methyl-TROSY spectra of the obtained proteins show that only half of the resonances of the methyl groups are observable.

The compound of formula 5 (2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid) enables the stereospecific labeling, with ¹³CH₃, of the proR methyl groups of Valine (γ1 methyl) and the proR methyl groups of Leucine (δ1 methyl).

It enables structural studies of proteins and large biomolecular assemblies by liquid NMR.

The labeling with this compound of formula 5 leads to labeled amino acids Leucine and Valine as shown in the following schema:

When this compound of formula 5 is used for obtaining TET2 (U-[²H], U-[¹²C], Leu/Val-[¹³C¹H₃]^(proR) TET2), a dodecameric protease of 468 kDa produced using E. coli cells in a deuterated M9 medium in the presence of 300 mg/L of the compound of formula 5, the methyl-TROSY spectrum of the obtained protein shows only half of the methyl resonances as compared to the spectrum of TET2 produced from α-ketoisovalerate protonated on the two methyl groups.
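The halving of the number of methyl resonances described above can be illustrated by a simple count. The following Python sketch is an idealized estimate for a single polypeptide chain. The function name and the toy sequence are hypothetical, and signal overlap or missing peaks are ignored.

```python
def expected_leu_val_methyl_peaks(sequence: str, stereospecific: bool) -> int:
    """Idealized number of Leu/Val methyl resonances for one polypeptide chain.

    With a precursor protonated on both methyl groups, each Leu or Val residue
    contributes two methyl signals (proR and proS). With a stereospecifically
    labeled acetolactate derivative, it contributes only one.
    """
    seq = sequence.upper()
    n_residues = seq.count("L") + seq.count("V")
    return n_residues * (1 if stereospecific else 2)


toy_sequence = "MLVKALVG"  # hypothetical sequence with 2 Leu and 2 Val
print(expected_leu_val_methyl_peaks(toy_sequence, stereospecific=False))
print(expected_leu_val_methyl_peaks(toy_sequence, stereospecific=True))
```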

The compound of formula 6 (2-(²H₅)ethyl-2-hydroxy-3-oxo-4(¹³C)methylbutanoic acid) enables the labeling, with ¹³CH₃, of the γ2 methyl groups of Isoleucine.

It enables structural studies of proteins and large biomolecular assemblies by liquid NMR.

This compound leads to (L)-Isoleucine labeled as shown in the following schema.

The methyl-TROSY spectrum of TET2 (U-[²H], U-[¹²C], Ile-[¹³C¹H₃]^(γ2) TET2), a dodecameric protease of 468 kDa produced in a deuterated M9 medium in the presence of 300 mg/L of the compound of formula 6, shows only the γ2 methyl groups of Isoleucine.

The compound of formula 7 (U-(¹³C)-2-hydroxy-2-methyl-3-oxo-4(²H₃)butanoic acid) enables the stereoselective assignment of the proS methyl groups of Valine (γ2 methyl) and the proS methyl groups of Leucine (δ2 methyl) in proteins and large biomolecular assemblies by liquid NMR.

This compound leads to the following labeling of Leucine and Valine:

The compound of formula 8 (U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid) enables the stereoselective assignment of the proR methyl groups of Valine (γ1 methyl) and the proR methyl groups of Leucine (δ1 methyl) in proteins and large biomolecular assemblies by liquid NMR.

The compound of formula 8 gives the following labeling of Leucine and Valine amino acids in proteins and biomolecular assemblies:

The compound of formula 9 (1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid) enables the assignment of the γ2 methyl groups of Isoleucine in proteins and biomolecular assemblies by liquid NMR.

Isoleucine is labeled as shown below:

The compound of formula 10 (2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid) enables the stereospecific labeling, with ¹³CDH₂, of the proS methyl groups of Valine (γ2 methyl) and the proS methyl groups of Leucine (δ2 methyl).

It can be used for ²H dynamic studies of proteins by liquid NMR.

The schema below shows how the amino acids Leucine and Valine are labeled with this compound of formula 10.

The compound of formula 11 (2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid) enables the stereospecific labeling, with ¹³CDH₂, of the proR methyl groups of Valine (γ1 methyl) and the proR methyl groups of Leucine (δ1 methyl).

It can be used for ²H dynamic studies of proteins by liquid NMR. This compound labels the amino acids Leucine and Valine in the manner shown in the following schema:

The compound of formula 12 (2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid) enables the specific labeling, with ¹³CDH₂, of the γ2 methyl groups of Isoleucine.

It can be used for ²H dynamic studies of proteins by liquid NMR.

The compound of formula 12 labels (L)-Isoleucine as shown in the following schema:

The compound of formula 13 (2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid) enables the stereospecific labeling, with ¹³CHD₂, of the proS methyl groups of Valine (γ2 methyl) and the proS methyl groups of Leucine (δ2 methyl).

It can be used for solid state NMR of proteins and dynamic studies of proteins by liquid NMR.

The following schema shows how compound 13 labels Leucine and Valine in proteins:

The compound of formula 14 (2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid) enables the stereospecific labeling, with ¹³CHD₂, of the proR methyl groups of Valine (γ1 methyl) and the proR methyl groups of Leucine (δ1 methyl).

It can be used for solid state NMR of proteins and for dynamic studies of proteins by liquid NMR.

This compound labels Leucine and Valine as shown in the following schema:

The compound of formula 15 (2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid) enables the specific labeling, with ¹³CHD₂, of the γ2 methyl groups of Isoleucine in proteins.

It can be used for solid state NMR of proteins and for dynamic studies of proteins by liquid NMR.

This compound labels Isoleucine as shown in the following schema:

The compound of formula 16 (U-(¹³C)-2-hydroxy-2-(²H₂)methyl-3-oxo-4-(²H₃)butanoic acid) can be used for solid state NMR of proteins.

It allows the labeling of the carbon chain of Valine and Leucine and, with ¹³CHD₂, of the proS methyl groups of Valine (γ2 methyl) and of the proS methyl groups of Leucine (δ2 methyl) in proteins, as shown in the following schema:

The compound of formula 17 (U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂)butanoic acid) can be used for solid state NMR of proteins.

It enables the labeling of the carbon chain of Leucine and Valine and, with ¹³CHD₂, of the proR methyl groups of Valine (γ1 methyl) and of the proR methyl groups of Leucine (δ1 methyl) in proteins, as shown in the following schema:

The compound of formula 18 (1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid) can be used for solid state NMR of proteins.

It enables the labeling of the carbon chain and, with ¹³CHD₂, of the γ2 methyl groups of Isoleucine in proteins, as shown in the following schema:

To summarize, among the acetolactate derivatives of formula I:

1) those more appropriate for assignment of NMR signals and/or measurements of structural restraints have the following formulae:

-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,
-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,
-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,
-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,
-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl)-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid.

2) those which are preferably used for dynamics studies by liquid state NMR and for use in solid state NMR are the compounds of the following formulae:

-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid, and

3) for dynamics studies by ²H NMR, preferred acetolactate derivatives of formula I have the following formulae:

-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,
-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid.
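The three groups summarized above can be held in a small lookup table, which may be convenient when choosing a precursor for a given experiment. The following is only an illustrative sketch: the group names and the helper `application_of` are hypothetical, and only the compound numbers are taken from the lists above.

```python
from typing import Optional

# Compound numbers grouped by preferred NMR application, as summarized above.
# The dictionary keys are illustrative labels, not terms used in the description.
APPLICATIONS = {
    "assignment_and_structural_restraints":
        [4, 5, 6, 9, 21, 22, 24, 36, 37, 40, 45, 46, 49],
    "liquid_state_dynamics_and_solid_state": [13, 14, 15, 34],
    "deuterium_dynamics": [10, 11, 12],
}


def application_of(compound: int) -> Optional[str]:
    """Return the preferred application group of a compound, or None if unlisted."""
    for group, members in APPLICATIONS.items():
        if compound in members:
            return group
    return None


print(application_of(4))   # compound used in the MSG and TET2 examples
print(application_of(12))
```

A compound that is absent from the three lists, such as compound 2, simply returns `None` in this sketch.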

The labeling process of the invention was applied, using the compound of formula 4, to malate synthase G (MSG), the largest monomeric protein (82 kDa) for which a 3D structure has been determined by NMR spectroscopy.

The methyl-TROSY spectra of specifically protonated MSG overexpressed in E. coli in the presence of the compound 4 of the invention have been compared with those of MSG overexpressed in E. coli in the presence of 2-oxo-3-(²H₃)-3-(²H)-4-(¹³C)-ketoisovalerate.

A direct comparison of these spectra shows that replacing α-ketoisovalerate with the compound of the invention increases the sensitivity of the NMR spectra by a factor of 1.6 (theoretically 2) and halves the number of peaks.

Only signals for Val-γ₂ and Leu-δ₂ methyl groups can be observed when the protein is prepared using synthetic 2-(¹³C)methyl-4-(²H₃)-acetolactate (compound of formula 4).

The absence of signals for Val-γ₁ and Leu-δ₁ methyl groups confirms that the ¹³C¹H₃ methyl substituent in position 2 of 2-(S)-acetolactate is stereospecifically transferred in-vivo to position proS of 2,3-dihydroxy-isovalerate by ketol-isomerase and that no methyl interconversion occurs in the following steps of the Leu/Val biosynthetic pathway.

Furthermore, no signals for CH₂D and CD₂H isotopomers were detected demonstrating that H/D exchange does not occur after the introduction of acetolactate to M9/D₂O culture medium. Lastly, no scrambling to other methine, methylene or methyl sites was detected, indicating that the excess of acetolactate in-vivo does not interfere with other metabolic pathways.

Thus, the use of specifically-labeled acetolactate is an efficient method to achieve complete and stereospecific methyl labeling of Leu and Val side chains in a fully perdeuterated protein background without detectable scrambling.

As the acetolactate derivatives are not incorporated into other metabolic pathways, they can be conveniently used in combination with other precursors such as those proposed for specific labeling of Ile-[CH₃]^(δ1), or Ala-[CH₃]^(β) methyl groups.

An obvious first application of this new labeling scheme is the stereospecific assignment of methyl groups of Leu, Val and Ile.

Despite the existence of efficient methods to connect methyl resonances to sequentially-assigned backbone nuclei, the stereospecific assignment of prochiral methyl groups remains difficult. Previous approaches have involved either a fractional [¹³C]-labeling strategy or the measurement of small scalar couplings.

While these approaches have been shown to be useful for small and medium size proteins, their application to larger proteins is challenging. In comparison, stereospecific assignment of Leu, Val, and Ile methyl groups can be obtained directly by visually inspecting 2D methyl-TROSY spectra recorded on specifically protonated proteins produced with compound of formula 4 according to the invention.

This strategy is directly applicable to large proteins as demonstrated for MSG (82 kDa) for which 98% of Leu/Val methyl groups were unambiguously assigned stereospecifically.

Specific protonation of methyl groups is a powerful method to extract long-range distance restraints in proteins.

For large proteins the extraction of inter-methyl NOE is generally hampered by the low intrinsic resolution of ¹³C-edited 4D NOESY spectra.

Simplification of spectra using the stereospecific labeling of prochiral methyl groups according to the process of the invention significantly reduces ambiguity in the analysis of nuclear Overhauser effect (NOE) cross peaks.

NOEs between proS and proR methyl groups are no longer detected using stereospecifically-labeled samples. However, the use of the compound 4 of the invention in place of 2-oxo-3-(²H₃)-3-(²H)-4-(¹³C)-ketoisovalerate increases the protonation level of proS methyl groups theoretically two-fold, which in turn enhances the intensity of the remaining NOE cross-peaks by a factor of theoretically 4.

This gain in sensitivity leads to the detection of new structurally meaningful long-range NOE cross-peaks between more remote proS methyl groups, thereby compensating for the loss of NOEs involving proR methyl groups.

A comparison of distance restraints extracted from 4D NOESY spectra revealed that stereospecific labeling of Leu/Val methyl groups increased the distance threshold for which an NOE can be detected by ~20%. Prochiral-specific labeling of methyl groups is not only an attractive way to simplify the time-consuming step of NOE analysis, it also allows a significant extension of the range of structurally meaningful ¹H-¹H distances that can be measured in large proteins.
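The order of magnitude of this extension can be rationalized with a simple estimate. The sketch below assumes that the NOE cross-peak intensity scales as r⁻⁶ (an initial-rate approximation, whereas the distances reported here were obtained by full relaxation matrix analysis) and that detection is limited by a fixed intensity threshold. The function name is hypothetical, and the use of the measured 1.6-fold gain for both methyl partners is an assumption made only for illustration.

```python
def distance_reach_factor(noe_gain: float) -> float:
    """Factor by which the largest detectable distance grows for a given
    NOE intensity gain, assuming intensity proportional to r**-6."""
    return noe_gain ** (1.0 / 6.0)


# Theoretical case: two-fold protonation of both methyl partners, NOE gain of 4.
theoretical = distance_reach_factor(4.0)

# Assumption for illustration: the 1.6-fold sensitivity gain measured in the
# methyl-TROSY spectra applies to both partners, giving an NOE gain of 1.6 ** 2.
measured_like = distance_reach_factor(1.6 ** 2)

print(f"theoretical reach: +{100 * (theoretical - 1):.0f}%")
print(f"with 1.6-fold gain per methyl: +{100 * (measured_like - 1):.0f}%")
```

Under these assumptions the reach increases by about 26% in the theoretical case and by about 17% with the measured gain, which brackets the increase of about 20% reported above.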

The process of this invention can also be used for the specific isotopic labeling of amino acids selected from the group consisting of Val, Leu and Ile, and more particularly for the specific labeling of methyl groups of Leu and Val as well as specific labeling of the γ2-methyl groups of isoleucine.

Another object of the invention is acetolactate derivatives of the following formula I-1:

wherein:

-   X is an exchangeable hydrogen being ¹H or ²H (D) depending on the nature of the solvent,
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D),

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.
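The three provisos can be read as a simple test on a labeling pattern. The following sketch is illustrative only: the class and function names are hypothetical, isotopes are encoded by mass number, a single exchangeable hydrogen X is modeled, and R¹ is assumed to be the methyl group in position 4 while R² is the substituent in position 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    carbons: List[int]    # carbon mass numbers, 12 or 13
    hydrogens: List[int]  # hydrogen mass numbers, 1 or 2


@dataclass
class Derivative:
    y: List[int]  # backbone carbons Y, 12 or 13
    r1: Group
    r2: Group
    x: int = 2    # exchangeable hydrogen X; 2 assumes a D2O-based solvent


def uniform(values: List[int]) -> bool:
    """True when every isotope in the list is the same."""
    return len(set(values)) == 1


def satisfies_provisos(d: Derivative) -> bool:
    methyl_h = d.r1.hydrogens + d.r2.hydrogens
    all_c = d.y + d.r1.carbons + d.r2.carbons
    proviso_1 = not uniform([d.x] + methyl_h)
    proviso_2 = not uniform(methyl_h)
    r1_deuterated = all(h == 2 for h in d.r1.hydrogens)
    proviso_3 = not (r1_deuterated and all(c == 12 for c in all_c))
    return proviso_1 and proviso_2 and proviso_3


# Compound 4 under the stated assumption: R1 = 12CD3, R2 = 13CH3, Y all 12C.
compound_4 = Derivative(y=[12, 12, 12],
                        r1=Group([12], [2, 2, 2]),
                        r2=Group([13], [1, 1, 1]))
# Same deuteration pattern without any 13C: excluded by proviso 3.
no_13c = Derivative(y=[12, 12, 12],
                    r1=Group([12], [2, 2, 2]),
                    r2=Group([12], [1, 1, 1]))

print(satisfies_provisos(compound_4), satisfies_provisos(no_13c))
```

In this encoding, proviso 2 already implies proviso 1, since X only adds one more hydrogen to a set that is required to be mixed.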

Preferred compounds of the invention have the formula I-1 in which:

-   R¹ is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D,
-   R² is chosen among the following groups: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D, ¹²CH₃ ¹²CD₂, ¹²CD₃ ¹²CD₂, ¹³CH₃ ¹²CD₂, ¹³CH₃ ¹³CD₂, ¹³CD₃ ¹³CD₂, ¹³CHD₂ ¹³CD₂, ¹³CH₂D ¹³CD₂, ¹³CHD₂ ¹²CD₂, ¹³CH₂D ¹²CD₂,
-   each Y is independently from the others ¹²C or ¹³C,

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D), and

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1 are not all, at the same time, ¹²C.

Preferred compounds of the invention are the compounds of following formulae 2-57:

-   2=2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid,
-   3=2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,
-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,
-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,
-   7=U-(¹³C)-2-hydroxy-2-methyl-3-oxo-4(²H₃)butanoic acid,
-   8=U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxobutanoic acid,
-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,
-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid,
-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   16=U-(¹³C)-2-hydroxy-2-(²H₂)methyl-3-oxo-4-(²H₃)butanoic acid,
-   17=U-(¹³C)-2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂)butanoic acid,
-   18=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   19=2-(1′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   20=2-(1′-(²H₂),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,
-   23=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl)-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   25=U-(¹³C)-2-(1′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   26=U-(¹³C)-2-(²H)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   27=U-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   28=U-(¹³C)-2-((²H₅)ethyl)-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   29=1,2,3-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   30=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   31=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   32=U-(¹³C)-2-(1′-(²H₂),2′-(²H))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   33=1,2,3-(¹³C)-2-(2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   35=1,2,3-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   38=3,4-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   39=3,4-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   41=3,4-(¹³C)-2-(²H,¹³C)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   42=3,4-(¹³C)-2-(²H₂,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   43=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   44=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   47=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   48=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   50=3,4-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   51=3,4-(¹³C)-2-(1′-(²H₂)-2′-(²H),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   52=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   53=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   54=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂)butanoic acid,
-   55=U-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H)butanoic acid,
-   56=1,2,3-(¹³C)-2-(1′-(²H₂),2′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   57=U-(¹³C)-2-(1′-(²H₂),2′-(²H₂))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.

Among these compounds, the compounds of the following formulae are preferred:

-   4=2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid,
-   5=2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid,
-   6=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid,
-   9=1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid,
-   21=1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   22=1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid,
-   24=1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl)-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   36=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   37=3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   40=3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxobutanoic acid,
-   45=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid,
-   46=3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid,
-   49=3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid.

But still other preferred compounds have the following formulae:

-   13=2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   14=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   15=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid,
-   34=2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.

Other preferred acetolactate derivatives according to the invention have the following formulae:

-   10=2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid,
-   11=2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid,
-   12=2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid.

The invention also proposes a process for manufacturing a compound of formula I, more particularly of formula 1-18, which comprises the following steps:

a) alkylation, with a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), of 3-oxobutanoate having its hydroxyl group in position 1 protected by a protecting group, preferably a methyl or ethyl group,

b) hydroxylation of the compound obtained in step a),

c) optionally deprotection and exchange of the desired ¹H atoms by ²H atoms, wherein step b) of hydroxylation is carried out by using dimethyldioxirane in presence of Nickel (II) ions.

Step b) of hydroxylation according to the process of the invention improves the total yield of the process to more than 90%, whereas when this hydroxylation step is carried out using molecular oxygen in the presence of Cobalt or Cerium ions as a catalyst, as suggested by the prior art, the total yield of the process is only about 50%, no other step being modified.

The labeling process of the invention is particularly appropriate for analysing proteins by NMR, including proteins within biomolecular assemblies.

Thus, the invention also proposes processes for analysing proteins by NMR.

In a first embodiment, this process comprises a step of labeling of proteins to be analysed by the process of labeling of the invention.

In a second embodiment, this process comprises a step of labeling of proteins to be analysed with a compound of the following formula I:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D).

In a third embodiment, this process comprises a step of labeling of proteins to be analysed with a compound of the following formula I-1:

wherein:

-   X is ¹H or ²H (D),
-   each Y is independently from the others ¹²C or ¹³C,
-   R¹ is a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),
-   R² is either a methyl group in which the carbon atom is ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D), or an ethyl group in which the carbon atoms are independently from each other ¹³C or ¹²C and the hydrogen atoms are independently from each other ¹H or ²H (D),

with the provisos that:

1) the hydrogen atoms of the acetolactate derivative of Formula I are not all, at the same time, either ¹H or ²H (D),

2) the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D),

3) when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.

In order that the invention be better understood, an example of carrying out the labeling process of the invention is given below. This example is only illustrative, and in no way limitative of the invention.

1. Synthesis of Selectively Methyl-labeled Acetolactate

Ethyl 2-(¹³C)methyl-3-oxobutanoate

A mixture of 12.15 mL (95.4 mmol) of ethyl 3-oxobutanoate (A), 14.5 g (104.9 mmol) K₂CO₃ and 15.0 g (104.9 mmol) ¹³C-methyl iodide (Cambridge Isotope Laboratories, Inc.) in 120 mL of absolute ethanol was heated at 40° C. under argon for 90 h. After filtration, the filtrate was concentrated in vacuo to afford 12.30 g (yield 89%) of product, which was sufficiently pure to be used without further purification. ¹H NMR (CDCl₃); 4.21 (q, J=7.1 Hz, 2H), 3.51 (dq, J=7.2, 4.4 Hz, 1H), 2.25 (s, 3H), 1.35(dd, J=130.4, 7.2 Hz, 3H), 1.29 (t, J=7.1 Hz, 3H).
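The reported yield can be cross-checked from the quantities given above. The sketch below is a minimal check, assuming standard atomic masses and the molecular formula C₇H₁₂O₃ with one ¹³C for the product; the helper `molar_mass` and the dictionary names are hypothetical.

```python
# Atomic masses in g/mol (standard values; "13C" is the isotopic mass of carbon-13).
MASS = {"C": 12.011, "13C": 13.00335, "H": 1.008, "O": 15.999}


def molar_mass(counts: dict) -> float:
    """Molar mass in g/mol for a composition given as {atom: count}."""
    return sum(MASS[atom] * n for atom, n in counts.items())


# Ethyl 2-(13C)methyl-3-oxobutanoate: C7H12O3 with one carbon replaced by 13C.
product = {"C": 6, "13C": 1, "H": 12, "O": 3}

substrate_mmol = 95.4   # ethyl 3-oxobutanoate, from the text
product_g = 12.30       # isolated mass, from the text

product_mmol = 1000 * product_g / molar_mass(product)
yield_pct = 100 * product_mmol / substrate_mmol
methyl_iodide_equiv = 104.9 / substrate_mmol

print(f"M = {molar_mass(product):.2f} g/mol, yield = {yield_pct:.0f}%")
print(f"13C-methyl iodide: {methyl_iodide_equiv:.2f} equivalents")
```

With these inputs the computed yield rounds to the 89% stated above, and the methyl iodide and K₂CO₃ are each used in about 1.1 equivalents.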

Ethyl 2-hydroxy-2-(¹³C)methyl-3-oxobutanoate (B)

The hydroxylation reaction was carried out with freshly prepared dimethyldioxirane in the presence of Nickel(II) ions. To a mixture of 50 mg (0.345 mmol) of ethyl 2-(¹³C)methyl-3-oxobutanoate in 3 mL of distilled water were added successively 8.6 mg (0.035 mmol) of Ni(OAc)₂.4H₂O and, at 0° C., 20 mL of an untitrated solution of dimethyldioxirane (0.05-0.10 M) in acetone. The resulting solution was allowed to warm to room temperature and stirred for 24 hours. The organic solvent was then evaporated in vacuo and the resulting aqueous residue was extracted with dichloromethane (four times). The organic extract was dried over Na₂SO₄ and concentrated in vacuo to afford 51 mg (0.317 mmol; 92% yield; >90% conversion) of ethyl 2-hydroxy-2-(¹³C)methyl-3-oxobutanoate as a colorless liquid which was pure enough to be used without further purification.
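The stoichiometry of this step follows directly from the quantities given above. The short calculation below is only a worked illustration; the variable names are hypothetical, and the dimethyldioxirane range reflects the untitrated concentration of 0.05-0.10 M.

```python
substrate_mmol = 0.345      # ethyl 2-(13C)methyl-3-oxobutanoate
nickel_mmol = 0.035         # Ni(OAc)2.4H2O
product_mmol = 0.317        # isolated hydroxylated ester
dmdo_volume_ml = 20.0
dmdo_molar_range = (0.05, 0.10)  # untitrated solution, mol/L

# Catalyst loading relative to the substrate, in mol%.
nickel_mol_pct = 100 * nickel_mmol / substrate_mmol

# Oxidant excess for the lower and upper concentration bounds (mL x mol/L = mmol).
dmdo_equiv = tuple(dmdo_volume_ml * c / substrate_mmol for c in dmdo_molar_range)

yield_pct = 100 * product_mmol / substrate_mmol

print(f"Ni(II): {nickel_mol_pct:.0f} mol%")
print(f"dimethyldioxirane: {dmdo_equiv[0]:.1f} to {dmdo_equiv[1]:.1f} equivalents")
print(f"yield: {yield_pct:.0f}%")
```

The oxidant is therefore used in roughly a three- to six-fold excess with about 10 mol% of Nickel(II), and the computed yield agrees with the 92% reported.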

NMR spectroscopy: ¹H NMR (CDCl₃); 4.25(q, J=7.1 Hz, 2H), 2.27 (s, 3H), 1.58 (d, ¹J_(H-13C)=130.3 Hz, 3H), 1.29 (t, J=7.1 Hz, 3H).

2-hydroxy-2-(¹³C)methyl-3-oxo-4-(²H₃)-butanoate (or 2-(¹³C)methyl-4-(²H₃)-acetolactate) (Compound of Formula 4)

Deprotection and exchange of the protons of the methyl group in position 4 of ethyl 2-hydroxy-2-(¹³C)methyl-3-oxobutanoate (B) were achieved in D₂O at pH 13. Typically, 300 mg of B was added to 24 mL of a 0.1 M NaOD/D₂O solution. The deprotection was immediate as observed by NMR spectroscopy. The completion of the exchange on the 4-methyl was also followed in real time by NMR spectroscopy. 97±1% of the protons of the terminal methyl groups had been exchanged after 30 min, while the methyl substituent in position 2 remained protonated at a level of 98±1%. As the deprotection reaction consumes hydroxide ions, the pH and consequently the deuterium exchange rate decrease during the reaction. Once the exchange was complete, the solution was adjusted to neutral pH with DCl and 2 mL of 1 M TRIS pH 8 in D₂O was added. The solution of the compound of formula 4 was then stored at −20° C. until required.
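The decrease in base concentration during deprotection can be estimated from the amounts given above. The sketch below assumes that hydrolysis of each ester molecule consumes exactly one equivalent of deuteroxide and that nothing else consumes base; the molar mass is computed for C₇H₁₂O₄ with one ¹³C, and all names are hypothetical.

```python
# Atomic masses in g/mol (standard values; "13C" is the isotopic mass of carbon-13).
MASS = {"C": 12.011, "13C": 13.00335, "H": 1.008, "O": 15.999}

# Ethyl 2-hydroxy-2-(13C)methyl-3-oxobutanoate (B): C7H12O4 with one 13C.
ester_molar_mass = 6 * MASS["C"] + MASS["13C"] + 12 * MASS["H"] + 4 * MASS["O"]

ester_mmol = 300.0 / ester_molar_mass  # 300 mg of B
base_mmol = 24.0 * 0.1                 # 24 mL of 0.1 M NaOD

# Assumption: one deuteroxide is consumed per ester hydrolysed.
residual_mmol = base_mmol - ester_mmol
residual_molar = residual_mmol / 24.0  # mmol per mL equals mol/L

print(f"B: {ester_mmol:.2f} mmol, NaOD: {base_mmol:.2f} mmol")
print(f"residual base after deprotection: {1000 * residual_molar:.0f} mM")
```

Under these assumptions the base concentration falls from 100 mM to roughly 22 mM once deprotection is complete, which is consistent with the slowing of the exchange noted above.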

Reaction scheme of the protocol for the production of U-[²H], U-[¹⁵N], Leu/Val-[¹³C¹H₃]^(proS) labeled proteins.

Detailed protocol for the chemical synthesis of 2-(¹³C)methyl-4-(²H₃)-acetolactate is presented above. ¹³C nuclei are displayed in italic bold. The stereochemistry, following the incorporation of ¹³C¹H₃ group into acetolactate, in the different intermediates of Leu/Val biogenesis pathway is indicated on the figure (assuming growth in M9/D₂O based culture medium). Each biosynthetic intermediate has been named according to the Kyoto Encyclopedia of Genes and Genomes. The enzymes responsible for catalyzing reaction are indicated by EC number. EC 1.1.1.86: ketol-acid reductoisomerase; EC 4.2.1.9: dihydroxy-acid dehydratase; EC 2.6.1.42: branched-chain amino acid aminotransferase; EC 2.3.3.13: 2-isopropylmalate synthase; EC 4.2.1.33: 3-isopropylmalate dehydratase; EC 1.1.1.85: 3-isopropylmalate dehydrogenase. Further information on the Leu/Val metabolic pathway can be found online: http://www.genome.jp/kegg/.

2. Overexpression of methyl stereospecifically labeled proteins in E. coli.

Optimization of the incorporation of acetolactate in overexpressed protein.

Initial experiments to determine the level of acetolactate incorporation into overexpressed proteins were performed using ubiquitin as a model system. E. coli BL21(DE3) cells were transformed with a pET41c plasmid carrying the human His-tagged ubiquitin (pET41c-His-Ubi) gene and transformants were grown in M9/D₂O media containing 1 g/L ¹⁵ND₄Cl and 2 g/L of U-[²H], U-[¹³C] glucose. When the optical density (O.D.) at 600 nm reached 0.8, a solution containing unlabeled acetolactate was added. After an additional 1 h, protein expression was induced by the addition of IPTG to a final concentration of 1 mM. Induction was performed for 3 hours at 37° C. Ubiquitin was purified by Ni-NTA (Qiagen) chromatography in a single step.

The optimal quantity of acetolactate required to achieve near complete incorporation in the overexpressed protein was assessed in a series of cultures (90 mL each) in which different amounts of unlabeled precursor were added 1 hour prior to induction to final concentrations of 0, 100, 200, 300 and 800 mg/L. The level of incorporation into the purified protein was monitored by directly-detected ¹³C 1D NMR. When the precursor is incorporated into the overexpressed protein, the ¹³C-L,V residues are replaced by amino acids with ¹²C side chains. The quantification was performed by comparing the integral of the signals of 4 isolated Leu/Val methyl resonances (19-21 ppm) with the signals of the methyl groups of Ile and Ala (between 9 and 19 ppm). The addition of 300 mg of pure acetolactate per liter of M9/D₂O culture medium achieves an incorporation level of 95% in Leu/Val side chains without detectable scrambling to other amino-acid biogenesis pathways.
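The quantification described above amounts to measuring how much the Leu/Val methyl signal drops relative to an internal reference when the unlabeled precursor is added. The following sketch illustrates one way to express this; the function name and the integral values are hypothetical and are not measured data.

```python
def incorporation_level(leu_val: float, reference: float,
                        leu_val_control: float, reference_control: float) -> float:
    """Fraction of Leu/Val methyl groups derived from the unlabeled precursor.

    Each Leu/Val integral is normalized by the Ile/Ala reference integral of the
    same spectrum, then compared with the culture grown without precursor.
    """
    ratio = leu_val / reference
    ratio_control = leu_val_control / reference_control
    return 1.0 - ratio / ratio_control


# Hypothetical integrals, in arbitrary units, chosen only to illustrate the formula.
level = incorporation_level(leu_val=0.2, reference=10.0,
                            leu_val_control=4.0, reference_control=10.0)
print(f"incorporation: {100 * level:.0f}%")
```

Normalizing by the Ile and Ala methyl signals compensates for differences in protein concentration between samples, since those residues are not affected by the precursor.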

Production of U-[²H], U-[¹⁵N], Leu/Val-[¹³C¹H₃]^(proS) proteins.

E. coli BL21(DE3) carrying the plasmid of the overexpressed protein (TET2 or MSG) were progressively adapted, in three stages, over 24 h, to M9/D₂O media containing 1 g/L ¹⁵ND₄Cl and 2 g/L D-glucose-d₇ (Isotec). In the final culture, the bacteria were grown at 37° C. in M9 media prepared with 99.85% D₂O (Eurisotop). When the O.D. (600 nm) reached 0.8, a solution containing 2-(¹³C)methyl-4-(²H₃)-acetolactate (compound of formula 4) (prepared with the protocol described above) was added. Acetolactate was added to the culture medium to a final concentration of 300 mg/L. 1 hour later, TET2 (/MSG) expression was induced by the addition of IPTG to a final concentration of 1 mM (/0.1 mM). Expression was performed for 3 hours (/12 hours) at 37° C. (/20° C.) before harvesting. For MSG, ¹³C spectra were recorded at 37° C. in D₂O on an NMR spectrometer operating at a proton frequency of 600 MHz. Only signals for Leu and Val methyl carbons were observed in the ¹³C spectra, indicating that the ¹³C¹H₃ groups of acetolactate were not incorporated into the metabolic pathways of other amino acids.
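Because the TET2 and MSG conditions are given side by side with the "(/...)" notation, it can help to separate them explicitly. The sketch below restates the protocol parameters from the paragraph above as data and computes the mass of precursor for a given culture volume; the class, dictionary and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionConditions:
    iptg_mM: float          # final IPTG concentration
    duration_h: float       # expression time before harvesting
    temperature_C: float    # expression temperature


# Values restated from the protocol above; the first of each pair is TET2, the second MSG.
CONDITIONS = {
    "TET2": ExpressionConditions(iptg_mM=1.0, duration_h=3, temperature_C=37),
    "MSG": ExpressionConditions(iptg_mM=0.1, duration_h=12, temperature_C=20),
}

PRECURSOR_MG_PER_L = 300.0   # final acetolactate concentration
OD600_AT_ADDITION = 0.8      # precursor is added at this optical density
DELAY_BEFORE_INDUCTION_H = 1.0


def precursor_mass_mg(culture_volume_l: float) -> float:
    """Mass of labeled acetolactate to add for a given culture volume."""
    return PRECURSOR_MG_PER_L * culture_volume_l


print(CONDITIONS["MSG"])
print(f"{precursor_mass_mg(0.5):.0f} mg of precursor for a 0.5 L culture")
```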

Production of U-[²H], U-[¹⁵N], Ile-[¹³C¹H₃]^(γ2) proteins.

E. coli BL21(DE3) carrying the plasmid of the overexpressed protein (TET2 or MSG) were progressively adapted, in three stages, over 24 h, to M9/D₂O media containing 1 g/L ¹⁵ND₄Cl and 2 g/L D-glucose-d₇ (Isotec). In the final culture, the bacteria were grown at 37° C. in M9 media prepared with 99.85% D₂O (Eurisotop). When the O.D. (600 nm) reached 0.8, a solution containing 2-(²H₅)ethyl-2-hydroxy-3-oxo-4(¹³C)butanoate (compound of formula 6) (prepared with the protocol described above) was added. Product was added to the culture medium to a final concentration of 300 mg/L. 1 hour later, TET2 (/MSG) expression was induced by the addition of IPTG to a final concentration of 1 mM (/0.1 mM). Expression was performed for 3 hours (/12 hours) at 37° C. (/20° C.) before harvesting.

Production of U-[²H], U-[¹⁵N], Leu/Val-[¹³C¹H₃, ¹²C²H₃] proteins.

For comparison, the production of perdeuterated proteins with non-stereospecific ¹³C¹H labeling of Leu and Val methyl groups was achieved using the protocol used before the invention, i.e. the protocol described by V. Tugarinov et al., J. Biomol. NMR 2004, 28, 165-172 and R. Lichtenecker et al., J. Am. Chem. Soc. 2004, 126, 5348-5349.

This protocol is the protocol described above, but with the addition, 1 hour prior to induction, of 125 mg/L of 3-(²H₃)methyl-3-(²H)-4-(¹³C)-ketoisovalerate (Isotec) in place of 300 mg/L of labeled acetolactate (compound of formula 4).

Production of U-[²H], U-[¹⁵N], U-[¹²C], U-[¹³C¹H₃]^(proS)-Leu/Val, U-[¹³C¹H₃]-Ala proteins.

The production of perdeuterated proteins with stereospecific ¹³C¹H labeling of Leu/Val proS methyl groups and Ala-positions was achieved using the general protocol described above, but with the addition, 1 hour prior to induction, of 800 mg/L of 2-(S)-2-(²H)-3-(¹³C)-Alanine (CortecNet) together with 300 mg/L of labeled acetolactate (compound of formula 4). A 2D ¹³C-methyl TROSY spectrum was recorded at 37° C. in D₂O on an NMR spectrometer operating at a proton frequency of 800 MHz. Only peaks corresponding to the expected signals of Alanine methyl groups and proS methyl groups of Leu and Val side chains were observed, indicating that labeling using acetolactate derivatives does not interfere with other methyl labeling processes.

Proteins Purification.

Malate Synthase G (MSG) was purified initially by Chelating Sepharose chromatography (GE Healthcare) followed by gel filtration chromatography (Superdex 200 pg GE Healthcare). Typical final yields after purification were 60-80 mg/L of methyl specific protonated MSG. The concentration of MSG in typical NMR samples was 1 mM in 100% D₂O buffer containing 25 mM MES (pH 7.0 uncorrected), 20 mM MgCl₂, 5 mM DTT. NMR data were acquired at 37° C.

TET2 was purified using two anion exchange chromatography steps (DEAE Sepharose CL-6B, and Resource Q 6 mL, GE Healthcare) followed by gel filtration (Sephacryl S-300 HR, GE Healthcare). Typical final yield after purification was 20 mg/L of methyl specific protonated TET2. Samples prepared in this manner were demonstrated to be fully active (measured by hydrolytic activity using Leu-4-nitroanilide). The final NMR samples of TET2 consisted of ~80 μM TET2 dodecamer (~1 mM monomer) in 20 mM Tris (pH 7.4 uncorrected), 20 mM NaCl dissolved in 300 μL D₂O. NMR data were acquired at 50° C.
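The relation between dodecamer and monomer concentrations, and the amount of protein in one TET2 sample, can be verified with a short calculation. The sketch below uses the 468 kDa mass of the TET2 assembly and the approximate sample values given in this example; the variable names are hypothetical.

```python
dodecamer_uM = 80.0        # approximate dodecamer concentration in the NMR sample
subunits = 12              # TET2 is studied as a dodecamer
assembly_kDa = 468.0       # mass of the TET2 assembly
sample_volume_uL = 300.0

monomer_mM = dodecamer_uM * subunits / 1000.0
monomer_kDa = assembly_kDa / subunits

# mol/L x L x g/mol, converted to mg.
protein_mg = (dodecamer_uM * 1e-6) * (sample_volume_uL * 1e-6) * (assembly_kDa * 1e3) * 1e3

print(f"monomer: {monomer_mM:.2f} mM, {monomer_kDa:.0f} kDa per subunit")
print(f"protein per sample: {protein_mg:.1f} mg")
```

With a typical yield of 20 mg/L, one such sample of about 11 mg corresponds to a little over half a liter of culture.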

3. NMR Spectroscopy.

EXPERIMENTAL DETAILS

All ¹H and ¹³C 1D NMR spectra of ubiquitin and MSG were recorded on a Varian DirectDrive spectrometer operating at a proton frequency of 600 MHz equipped with a cryogenic triple resonance pulsed field gradient probehead.

2D methyl-TROSY spectra were recorded on a Varian DirectDrive spectrometer operating at a proton frequency of 800 MHz equipped with a cryogenic triple resonance pulsed field gradient probehead. The ¹H-¹³C HMQC of MSG (/TET2) were recorded with 1288 (/780) complex data points in direct dimension (maximum t₂=99 ms (/60 ms)) and 512 (/380) points in carbon dimension (maximum t₁=128 ms (/47 ms)).
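The spectral widths are not stated, but they can be estimated from the number of complex points and the maximum evolution times. The sketch below assumes that the maximum time equals the number of complex points multiplied by the dwell time (one complex point per 1/SW); the ¹³C frequency is derived from the standard ¹³C/¹H frequency ratio of about 0.25145, and the function name is hypothetical.

```python
def spectral_width_hz(complex_points: int, t_max_s: float) -> float:
    """Approximate spectral width, assuming t_max = complex_points / SW."""
    return complex_points / t_max_s


PROTON_MHZ = 800.0
CARBON_MHZ = PROTON_MHZ * 0.25145   # 13C Larmor frequency at the same field

# MSG 1H-13C HMQC parameters from the text.
sw_h = spectral_width_hz(1288, 0.099)   # direct 1H dimension
sw_c = spectral_width_hz(512, 0.128)    # indirect 13C dimension

print(f"1H: {sw_h:.0f} Hz, about {sw_h / PROTON_MHZ:.1f} ppm")
print(f"13C: {sw_c:.0f} Hz, about {sw_c / CARBON_MHZ:.1f} ppm")
```

Under this assumption the MSG experiment covers about 13 kHz in the proton dimension and 4 kHz in the carbon dimension, the latter being wide enough for the methyl region. The same function can be applied to the TET2 parameters.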

The 4D HMQC-NOESY-HMQC experiments were recorded on a Varian DirectDrive spectrometer operating at a proton frequency of 800 MHz equipped with a cryogenic triple resonance pulsed field gradient probehead. Data were acquired in 96 h on a 1 mM sample of MSG with a NOE mixing time of 300 ms. The experiments were collected with 20 complex points in the indirect ¹H dimension (maximum t₁=30 ms), 36 and 18 complex points in the first and second carbon dimensions (maximum t₂=21 ms and t₃=11 ms), 201 complex points in the direct dimension (maximum t₄=80 ms) and 4 scans per increment. All data were processed and analyzed using nmrPipe/nmrDraw and NMRView. Distances were quantified using a full relaxation matrix analysis of NOEs between remote protons in methyl-specific protonated proteins as described in Sounier et al., J. Am. Chem. Soc. 2007, 129, 472-473.

Comparison of methyl-TROSY spectra recorded on Leu/Val methyl-specifically labeled TET2 samples (468 kDa).

2D ¹³C-methyl TROSY spectra were recorded at 50° C. in D₂O on an NMR spectrometer operating at a proton frequency of 800 MHz for U-[²H], U-[¹²C], Leu/Val-[¹²C²H₃, ¹³C¹H₃] TET2 with non-stereospecific [¹³C¹H₃]-methyl labeling prepared using 3-(²H₃)methyl-3-(²H)-4-(¹³C)-ketoisovalerate; and for U-[²H], U-[¹²C], Leu/Val-[¹³C¹H₃]^(proS) TET2 with stereospecific labeling prepared using 2-(¹³C)methyl-4-(²H₃)-acetolactate (compound of formula 4). Preparing the TET2 assembly with a non-stereospecific labeling scheme results in spectra with substantial cross-peak overlap that would have greatly hampered the observation of amastatin-induced chemical shift changes (amastatin is an inhibitor of TET2).

The invention claimed is:
 1. A composition consisting essentially of a compound having a formula I-1:

wherein: X is ¹H or ²H (D), each Y is independently ¹²C or ¹³C, R¹ is a methyl group whose carbon atom is ¹³C or ¹²C and whose hydrogen atoms are each independently ¹H or ²H (D), R² is (i) a methyl group whose carbon atom is ¹³C or ¹²C and whose hydrogen atoms are each independently ¹H or ²H (D), or (ii) an ethyl group whose carbon atoms are each independently ¹³C or ¹²C and whose hydrogen atoms are each independently ¹H or ²H (D), with the provisos that: the hydrogen atoms of R¹ and the hydrogen atoms of R² are not all, at the same time, either ¹H or ²H (D), and when all the hydrogen atoms of R¹ are ²H, then the carbon atoms, in formula I-1, are not all, at the same time, ¹²C.
 2. The composition of claim 1, wherein: R¹ is selected from the group consisting of: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, and ¹³CH₂D, and R² is selected from the group consisting of: ¹²CH₃, ¹²CD₃, ¹³CH₃, ¹³CD₃, ¹³CHD₂, ¹³CH₂D, ¹²CH₃ ¹²CD₂, ¹²CD₃ ¹²CD₂, ¹³CH₃ ¹²CD₂, ¹³CH₃ ¹³CD₂, ¹³CD₃ ¹³CD₂, ¹³CHD₂ ¹³CD₂, ¹³CH₂D¹³CD₂, ¹³CHD₂ ¹²CD₂, and ¹³CH₂D¹²CD₂.
 3. The composition of claim 1, wherein the compound having the formula I-1 is selected from the group consisting of: 2-hydroxy-2-(¹³C)methyl-3-oxo-4(²H₃)butanoic acid, 2-hydroxy-2-(²H₃)methyl-3-oxo-4(¹³C)butanoic acid, 2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(¹³C)methylbutanoic acid, 1,2,3,4-(¹³C)-2-(²H₅)ethyl-2-hydroxy-3-oxobutanoic acid, 1,2,3-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid, 1,2,3,4-(¹³C)-2-(²H₃)methyl-2-hydroxy-3-oxobutanoic acid, 1,2,3-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid, 3,4-(¹³C)-2-(¹³C)methyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid, 3,4-(¹³C)-2-(²H₃,¹³C)methyl-2-hydroxy-3-oxobutanoic acid, 3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid, 3,4-(¹³C)-2-(²H₅,¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid, and 3,4-(¹³C)-2-(1′-(²H₂),¹³C₂)ethyl-2-hydroxy-3-oxobutanoic acid.
 4. The composition of claim 1, wherein the compound having the formula I-1 is selected from the group consisting of: 2-hydroxy-2-(²H₂,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid, 2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H₂,¹³C)butanoic acid, 2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H₂,¹³C)butanoic acid, and 2-(1′-(²H₂),2′-(²H),2′-(¹³C))ethyl-2-hydroxy-3-oxo-4-(²H₃)butanoic acid.
 5. The composition of claim 1, wherein the compound having the formula I-1 is selected from the group consisting of: 2-hydroxy-2-(²H,¹³C)methyl-3-oxo-4-(²H₃)butanoic acid, 2-hydroxy-2-(²H₃)methyl-3-oxo-4-(²H,¹³C)butanoic acid, and 2-(²H₅)ethyl-2-hydroxy-3-oxo-4-(²H,¹³C)butanoic acid.
 6. A process for analyzing a protein by NMR comprising labeling the protein with the composition of claim 1.
 7. A process for analyzing a protein by NMR comprising labeling the protein with the composition of claim 2.
 8. A process for analyzing a protein by NMR comprising labeling the protein with the composition of claim 3.
 9. A process for analyzing a protein by NMR comprising labeling the protein with the composition of claim 4.
 10. A process for analyzing a protein by NMR comprising labeling the protein with the composition of claim 5.